Articulated Object Estimation in the Wild

Werby, Abdelrhman, Büchner, Martin, Röfer, Adrian, Huang, Chenguang, Burgard, Wolfram, Valada, Abhinav

arXiv.org Artificial Intelligence

Understanding the 3D motion of articulated objects is essential in robotic scene understanding, mobile manipulation, and motion planning. Prior methods for articulation estimation have primarily focused on controlled settings, assuming either fixed camera viewpoints or direct observations of various object states, which tend to fail in more realistic unconstrained environments. In contrast, humans effortlessly infer articulation by watching others manipulate objects. Inspired by this, we introduce ArtiPoint, a novel estimation framework that can infer articulated object models under dynamic camera motion and partial observability. By combining deep point tracking with a factor graph optimization framework, ArtiPoint robustly estimates articulated part trajectories and articulation axes directly from raw RGB-D videos. To foster future research in this domain, we introduce Arti4D, the first ego-centric in-the-wild dataset that captures articulated object interactions at a scene level, accompanied by articulation labels and ground-truth camera poses. We benchmark ArtiPoint against a range of classical and learning-based baselines, demonstrating its superior performance on Arti4D. We make code and Arti4D publicly available at https://artipoint.cs.uni-freiburg.de.
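
As a rough illustration of the underlying geometry (not ArtiPoint's point tracking or factor-graph optimization), the sketch below estimates a revolute axis direction from the 3D trajectory of a single tracked point, assuming camera motion has already been compensated; all names are illustrative.

```python
# Minimal sketch: a point rigidly attached to a rotating part traces a
# circular arc, so the direction of least variance of its trajectory
# approximates the rotation-axis direction.
import numpy as np

def fit_revolute_axis(points: np.ndarray):
    """points: (N, 3) positions of one tracked point over time."""
    centroid = points.mean(axis=0)
    _, _, vt = np.linalg.svd(points - centroid)   # principal directions of the arc
    axis_dir = vt[-1] / np.linalg.norm(vt[-1])    # least-variance direction = plane normal
    return axis_dir, centroid

if __name__ == "__main__":
    # Synthetic handle point rotating about the z-axis at height 1.5 m.
    angles = np.linspace(0.0, 1.2, 30)
    traj = np.stack([0.8 * np.cos(angles),
                     0.8 * np.sin(angles),
                     np.full_like(angles, 1.5)], axis=1)
    axis_dir, _ = fit_revolute_axis(traj)
    print("estimated axis direction:", np.round(axis_dir, 3))   # ~ [0, 0, +/-1]
```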


KinScene: Model-Based Mobile Manipulation of Articulated Scenes

Hsu, Cheng-Chun, Abbatematteo, Ben, Jiang, Zhenyu, Zhu, Yuke, Martín-Martín, Roberto, Biswas, Joydeep

arXiv.org Artificial Intelligence

Sequentially interacting with articulated objects is crucial for a mobile manipulator to operate effectively in everyday environments. To enable long-horizon tasks involving articulated objects, this study explores building scene-level articulation models for indoor scenes through autonomous exploration. While previous research has studied mobile manipulation with articulated objects by considering object kinematic constraints, it primarily focuses on individual-object scenarios and lacks extension to a scene-level context for task-level planning. To manipulate multiple object parts sequentially, the robot needs to reason about the resultant motion of each part and anticipate its impact on future actions. We introduce KinScene, a full-stack approach for long-horizon manipulation tasks with articulated objects. The robot maps the scene, detects and physically interacts with articulated objects, collects observations, and infers the articulation properties. For sequential tasks, the robot plans a feasible series of object interactions based on the inferred articulation model. We demonstrate that our approach repeatably constructs accurate scene-level kinematic and geometric models, enabling long-horizon mobile manipulation in a real-world scene. Code and additional results are available at https://chengchunhsu.github.io/KinScene/
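
A minimal sketch of what a scene-level articulation model could look like as a data structure, together with a joint-limit feasibility check over a sequence of part interactions; the record fields and the check are illustrative assumptions, not KinScene's actual representation or planner.

```python
# Hypothetical scene-level articulation model: one joint record per detected
# movable part, plus a check that a planned interaction sequence stays within
# the inferred joint limits.
from dataclasses import dataclass

@dataclass
class JointRecord:
    joint_type: str      # "revolute" or "prismatic"
    axis: tuple          # joint axis direction in the scene frame
    limits: tuple        # (min, max) in radians or meters

def plan_is_feasible(scene: dict, plan: list) -> bool:
    """plan: list of (part_name, target_configuration) pairs."""
    for part, target in plan:
        lo, hi = scene[part].limits
        if not (lo <= target <= hi):
            return False     # target violates the inferred joint limits
    return True

scene = {
    "cabinet_door": JointRecord("revolute",  (0, 0, 1), (0.0, 1.6)),
    "drawer":       JointRecord("prismatic", (1, 0, 0), (0.0, 0.4)),
}
print(plan_is_feasible(scene, [("cabinet_door", 1.2), ("drawer", 0.3)]))   # True
```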


Neural Implicit Representation for Building Digital Twins of Unknown Articulated Objects

Weng, Yijia, Wen, Bowen, Tremblay, Jonathan, Blukis, Valts, Fox, Dieter, Guibas, Leonidas, Birchfield, Stan

arXiv.org Artificial Intelligence

We address the problem of building digital twins of unknown articulated objects from two RGBD scans of the object at different articulation states. We decompose the problem into two stages, each addressing distinct aspects. Our method first reconstructs object-level shape at each state, then recovers the underlying articulation model including part segmentation and joint articulations that associate the two states. By explicitly modeling point-level correspondences and exploiting cues from images, 3D reconstructions, and kinematics, our method yields more accurate and stable results compared to prior work. It also handles more than one movable part and does not rely on any object shape or structure priors. Project page: https://github.com/NVlabs/DigitalTwinArt
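
As a simplified stand-in for the correspondence-based reasoning described above (not the paper's neural implicit pipeline), the sketch below recovers the rigid motion of a movable part between two scans with the Kabsch algorithm and reads a revolute axis and opening angle off the resulting rotation.

```python
# Minimal sketch: given point correspondences on one movable part in two
# articulation states, estimate the relative rigid transform and its rotation axis.
import numpy as np

def kabsch(src: np.ndarray, dst: np.ndarray):
    """Least-squares rigid transform (R, t) mapping src (N, 3) onto dst (N, 3)."""
    cs, cd = src.mean(axis=0), dst.mean(axis=0)
    H = (src - cs).T @ (dst - cd)
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))        # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return R, cd - R @ cs

def rotation_axis_angle(R: np.ndarray):
    angle = np.arccos(np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0))
    w, V = np.linalg.eig(R)
    axis = np.real(V[:, np.argmin(np.abs(w - 1.0))])   # eigenvector with eigenvalue 1
    return axis / np.linalg.norm(axis), angle

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    part_a = rng.uniform(-0.2, 0.2, size=(50, 3))       # part points in state A
    theta = 0.7                                          # ground-truth opening angle
    Rz = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta),  np.cos(theta), 0.0],
                   [0.0, 0.0, 1.0]])
    part_b = part_a @ Rz.T + np.array([0.05, 0.0, 0.0])  # same points in state B
    R, t = kabsch(part_a, part_b)
    axis, angle = rotation_axis_angle(R)
    print("axis:", np.round(axis, 3), "angle:", round(float(angle), 3))
```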


Ditto in the House: Building Articulation Models of Indoor Scenes through Interactive Perception

Hsu, Cheng-Chun, Jiang, Zhenyu, Zhu, Yuke

arXiv.org Artificial Intelligence

Virtualizing the physical world into virtual models has been a critical technique for robot navigation and planning in the real world. Prior works primarily focus on individual objects, whereas scaling to room-sized environments requires the robot to efficiently and effectively explore the large-scale 3D space. We introduce an interactive perception approach to this task. The robot discovers and physically interacts with the articulated objects in the environment, collecting visual observations before and after each interaction. Based on these observations, the robot infers the articulation properties of each object.
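
A minimal sketch of such an interactive-perception loop; the methods discover_candidates, capture_rgbd, interact_with, and infer_articulation are hypothetical placeholders for the corresponding modules, not the authors' API.

```python
# Hypothetical loop: discover candidate articulated parts, interact with each,
# record observations before and after, and infer the articulation properties.
def build_scene_articulation_model(robot, scene_map, infer_articulation):
    model = {}
    for candidate in robot.discover_candidates(scene_map):   # e.g. affordance-based proposals
        before = robot.capture_rgbd(candidate)                # observation before interaction
        robot.interact_with(candidate)                        # induce articulated motion
        after = robot.capture_rgbd(candidate)                 # observation after interaction
        model[candidate.name] = infer_articulation(before, after)
    return model
```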


Category-Independent Articulated Object Tracking with Factor Graphs

Heppert, Nick, Migimatsu, Toki, Yi, Brent, Chen, Claire, Bohg, Jeannette

arXiv.org Artificial Intelligence

Robots deployed in human-centric environments may need to manipulate a diverse range of articulated objects, such as doors, dishwashers, and cabinets. Articulated objects often come with unexpected articulation mechanisms that are inconsistent with categorical priors: for example, a drawer might rotate about a hinge joint instead of sliding open. We propose a category-independent framework for predicting the articulation models of unknown objects from sequences of RGB-D images. The prediction is performed by a two-step process: first, a visual perception module tracks object part poses from raw images, and second, a factor graph takes these poses and infers the articulation model including the current configuration between the parts as a 6D twist. We also propose a manipulation-oriented metric to evaluate predicted joint twists in terms of how well a compliant robot controller would be able to manipulate the articulated object given the predicted twist. We demonstrate that our visual perception and factor graph modules outperform baselines on simulated data and show the applicability of our factor graph on real world data.
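
To make the twist representation concrete, the sketch below builds a 6D twist for a revolute joint from an estimated axis direction and a point on the axis, and applies it through the SE(3) exponential map to predict the relative part pose at a given joint angle; this is standard screw theory, not the paper's factor-graph inference.

```python
# Minimal sketch: revolute joint as a twist (v, omega) with v = -omega x q,
# where q is a point on the axis and omega is the unit axis direction.
import numpy as np

def hat(w):
    """Skew-symmetric matrix such that hat(w) @ x == np.cross(w, x)."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def exp_se3(v, omega, theta):
    """SE(3) exponential of the unit-rotation twist (v, omega) scaled by theta."""
    W = hat(omega)
    R = np.eye(3) + np.sin(theta) * W + (1.0 - np.cos(theta)) * W @ W
    G = np.eye(3) * theta + (1.0 - np.cos(theta)) * W + (theta - np.sin(theta)) * W @ W
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = G @ v
    return T

axis_dir = np.array([0.0, 0.0, 1.0])            # estimated joint axis direction
axis_point = np.array([0.4, 0.0, 0.0])          # a point on the axis
v = -np.cross(axis_dir, axis_point)             # translational part of the twist
print(np.round(exp_se3(v, axis_dir, 0.5), 3))   # relative pose after 0.5 rad of opening
```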


Articulated Object Interaction in Unknown Scenes with Whole-Body Mobile Manipulation

Mittal, Mayank, Hoeller, David, Farshidian, Farbod, Hutter, Marco, Garg, Animesh

arXiv.org Artificial Intelligence

A kitchen assistant needs to operate human-scale objects, such as cabinets and ovens, in unmapped environments with dynamic obstacles. Autonomous interactions in such real-world environments require integrating dexterous manipulation and fluid mobility. While mobile manipulators in different form-factors provide an extended workspace, their real-world adoption has been limited. This limitation is due in part to two main reasons: 1) inability to interact with unknown human-scale objects such as cabinets and ovens, and 2) inefficient coordination between the arm and the mobile base. Executing a high-level task for general objects requires a perceptual understanding of the object as well as adaptive whole-body control among dynamic obstacles. In this paper, we propose a two-stage architecture for autonomous interaction with large articulated objects in unknown environments. The first stage uses a learned model to estimate the articulated model of a target object from an RGB-D input and predicts an action-conditional sequence of states for interaction. The second stage comprises a whole-body motion controller to manipulate the object along the generated kinematic plan. We show that our proposed pipeline can handle complicated static and dynamic kitchen settings. Moreover, we demonstrate that the proposed approach achieves better performance than commonly used control methods in mobile manipulation. For additional material, please check: https://www.pair.toronto.edu/articulated-mm/
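
As a toy example of the kind of kinematic plan the first stage could hand to the whole-body controller (assuming a known revolute joint, not the paper's learned state-sequence predictor), the sketch below generates grasp-point waypoints along the opening arc.

```python
# Minimal sketch: rotate the grasp point about the estimated hinge axis in
# small increments to obtain end-effector waypoints for the controller.
import numpy as np

def rotate_about_axis(p, axis_point, axis_dir, theta):
    """Rodrigues rotation of point p by theta about the axis through axis_point."""
    k = axis_dir / np.linalg.norm(axis_dir)
    r = p - axis_point
    return axis_point + (r * np.cos(theta)
                         + np.cross(k, r) * np.sin(theta)
                         + k * np.dot(k, r) * (1.0 - np.cos(theta)))

def kinematic_plan(grasp_point, axis_point, axis_dir, target_angle, n_waypoints=10):
    return [rotate_about_axis(grasp_point, axis_point, axis_dir, t)
            for t in np.linspace(0.0, target_angle, n_waypoints)]

plan = kinematic_plan(np.array([0.7, 0.0, 1.0]),   # handle position
                      np.array([0.0, 0.0, 0.0]),   # point on the hinge axis
                      np.array([0.0, 0.0, 1.0]),   # hinge axis direction
                      target_angle=1.4)
print(np.round(plan[-1], 3))                        # handle position at full opening
```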


Learning Kinematic Models for Articulated Objects

Sturm, Jürgen (University of Freiburg) | Pradeep, Vijay (Willow Garage) | Stachniss, Cyrill (University of Freiburg) | Plagemann, Christian (Stanford University) | Konolige, Kurt (Willow Garage) | Burgard, Wolfram (University of Freiburg)

AAAI Conferences

Robots operating in home environments must be able to interact with articulated objects such as doors or drawers.  Ideally, robots are able to autonomously infer articulation models by observation.  In this paper, we present an approach to learn kinematic models by inferring the connectivity of rigid parts and the articulation models for the corresponding links.  Our method uses a mixture of parameterized and parameter-free (Gaussian process) representations and finds low-dimensional manifolds that provide the best explanation of the given observations.  Our approach has been implemented and evaluated using real data obtained in various realistic home environment settings.
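
In the spirit of selecting the kinematic model that best explains the observations (a simplified BIC comparison rather than the authors' mixture of parametric and Gaussian-process models), the sketch below fits a prismatic (line) and a revolute (planar circle) candidate to an observed relative trajectory and keeps the better-scoring one; the parameter counts are illustrative.

```python
# Minimal sketch: candidate joint models scored by the Bayesian Information
# Criterion on their fit residuals.
import numpy as np

def line_residuals(points):
    c = points.mean(axis=0)
    d = np.linalg.svd(points - c)[2][0]                 # dominant direction
    proj = c + np.outer((points - c) @ d, d)            # closest points on the line
    return np.linalg.norm(points - proj, axis=1)

def circle_residuals(points):
    c = points.mean(axis=0)
    vt = np.linalg.svd(points - c)[2]
    xy = (points - c) @ vt[:2].T                        # coordinates in the best-fit plane
    A = np.column_stack([xy, np.ones(len(xy))])
    b = -(xy ** 2).sum(axis=1)
    (a1, a2, a3), *_ = np.linalg.lstsq(A, b, rcond=None)  # algebraic circle fit
    center = np.array([-a1 / 2.0, -a2 / 2.0])
    radius = np.sqrt(center @ center - a3)
    return np.linalg.norm(xy - center, axis=1) - radius

def bic(residuals, n_params):
    n = len(residuals)
    rss = float(np.sum(residuals ** 2)) + 1e-12
    return n * np.log(rss / n) + n_params * np.log(n)

t = np.linspace(0.0, 1.5, 40)                           # a door-like circular arc
obs = np.stack([0.5 * np.cos(t), 0.5 * np.sin(t), np.zeros_like(t)], axis=1)
scores = {"prismatic": bic(line_residuals(obs), 4),
          "revolute":  bic(circle_residuals(obs), 6)}
print(min(scores, key=scores.get))                      # -> "revolute"
```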